The fully nonparametric formulation of the empirical Bayes estimation problem considers
m populations characterized by conditional (sampling) distributions chosen independently
by some unspecified random mechanism. No parametric constraints are imposed
on the family of possible sampling distributions or on the prior mechanism which selects
them. The quantity to be estimated subject to squared-error loss for each population is
defined by a functional T(F) where F is the population sampling cdf. The empirical Bayes
estimator is based on n iid observations from each population where n > 1. Asymptotically
optimal procedures for this problem typically employ consistent nonparametric estimators
of certain nonlinear conditional expectation functions. In this study a particular projection
pursuit algorithm is used for this purpose. The proposed method is applied to the
estimation of population means for several simulated data sets and one familiar real world
data set. Certain possible extensions are discussed.
1.  Introduction.

The purpose of this paper is to show how an old idea may be effectively implemented
using new technology. The old idea is the notion of fully nonparametric empirical Bayes
estimation, which was introduced by the author in a paper (Johns, 1957) directly inspired by
the fundamental paper of Robbins (1955). The new technique is computer-based projection
pursuit regression analysis.

The fully nonparametric approach to empirical Bayes estimation differs from the original
Robbins formulation in that it does not require the specification of a parametric family
for the conditional (sampling) distributions of the independent component populations.
Neither formulation makes parametric assumptions about the prior distribution of the
quantity being estimated. This is in contrast to the case of “parametric” empirical Bayes
estimation (see, e.g., Efron-Morris, 1975) where parametric models are specified for both
the conditional and prior distributions, and the “restricted” case where the estimators are
constrained to have a particular simple form (see Robbins, 1983). It should be noted that
the fully nonparametric version of the problem requires that at least two observations be
obtained from each component population.

When the empirical Bayes approach was first introduced, and for some time thereafter,
it seemed that application of the methods to real world data would not often be
feasible because of computational difficulties and the possibility that a very large number
of component populations might be needed before approximately optimal results could be
obtained. Indeed, one advantage of the parametric approach, or the restriction to linear
forms of estimation, is the increased capacity to deal with real data sets of modest size at
the cost of some potential loss of asymptotic efficiency. The original version of the fully
nonparametric methodology (Johns, 1957), with which this paper is principally concerned,
was of little practical use in a world where large scale digital computers had barely appeared
on the scene. Fortunately, the present widespread availability of computational power and
the development of sophisticated statistical software have opened up new possibilities.

One of the central requirements for dealing with the fully nonparametric empirical
Bayes problem is the estimation of a conditional expectation function of unknown form
involving several variables. In the original paper (Johns, 1957) a pointwise consistent
estimator was proposed based on successive refinements of a partition of d-dimensional
space. A convergence result (Lemma 5), which in a later incarnation has become known
as the generalized Lebesgue dominated convergence theorem, was then used to show
convergence to the Bayes optimal risk for the proposed empirical Bayes estimator. Some of
these results could be regarded as primitive precursors of the more recent work of Stone
(1981). In the last few years several other sophisticated methods for the nonparametric
estimation of conditional expectation (regression) have been proposed. These include
kernel smoothers, nearest neighbor estimates, recursive partitioning, and, notably, projection
pursuit regression as proposed by Friedman and Stuetzle (1981). A comprehensive
discussion of projection pursuit methods may be found in Huber (1985) where it is noted
that, almost alone among multivariate procedures, they avoid many of the difficulties
associated with high dimensionality and the presence of uninformative observations.

In the present study the regression aspect of the fully nonparametric empirical Bayes
estimation procedure has been dealt with by substituting a projection pursuit regression
scheme for the original conditional expectation estimator. The particular algorithm used
is called the Smooth Multiple Additive Regression Technique (SMART) and is detailed
in Friedman (1984). In section 2 the problem and the proposed solution are described
more formally. In section 3 the proposed method is applied to several data sets generated
by computer simulation and the results are discussed. The method is also applied
to the famous Efron-Morris baseball data. Section 4 contains concluding remarks and
acknowledgements.

2.  The  Problem  and  the  Proposed  Method. 

We consider m populations from each of which n observations are obtained. Let these
observations be given by

$X_{ij}$ = the ith observation from the jth population,

$i = 1, 2, \ldots, n; \quad j = 1, 2, \ldots, m.$

We assume that for each j the $X_{ij}$'s are iid with common random cdf $F_j$, where
$F_1, F_2, \ldots, F_m$ are assumed to be selected independently according to some unknown prior
probability measure over all cdf’s. Let T(F) = a real-valued functional defined on all
cdf’s which represents the “parameter” to be estimated for each population subject to
squared-error loss, i.e., $\theta_j = T(F_j)$, and for any estimator $\hat\theta_j$ the loss incurred is
$(\hat\theta_j - \theta_j)^2$. If $\theta = (\theta_1, \theta_2, \ldots, \theta_m)$ and $\hat\theta = (\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_m)$, then the average loss
for the m component populations is

(1)  $L(\hat\theta, \theta) = (\hat\theta - \theta)(\hat\theta - \theta)' / m.$

The corresponding average risk is then

(2)  $R(\hat\theta) = E\{L(\hat\theta, \theta)\},$

where the expectation operator E reflects the randomness in the selection of the $F_j$'s as
well as the $X_{ij}$'s. Initially, we consider functionals of the form

(3)  $T(F) = E_F\{h(X)\},$

where $h(\cdot)$ is a specified function and X has cdf F. For example, if the quantity we wish
to estimate is the mean of F we would set $h(x) = x$, so that

$T(F) = \int x \, dF(x).$


In  section  4  we  indicate  a  method  for  dealing  with  more  general  functionals. 

We observe that for each j, the Bayes optimal estimate of $\theta_j = T(F_j)$ under squared-error
loss is

$\tilde\theta_j = E\{\theta_j \mid X_{ij},\ 1 \le i \le n\}.$

If the observation $X_{kj}$ is omitted from the data for the jth population for some k, $1 \le k \le n$,
then the corresponding Bayes estimator for $\theta_j$ is

$\tilde\theta_j^{(k)} = E\{\theta_j \mid X_{ij},\ 1 \le i \le n,\ i \ne k\}$

$\quad = E\{E[h(X_{kj}) \mid F_j] \mid X_{ij},\ 1 \le i \le n,\ i \ne k\}$

(4)  $\quad = E\{h(X_{kj}) \mid X_{ij},\ 1 \le i \le n,\ i \ne k\}$

$\quad = \phi(X_{ij},\ 1 \le i \le n,\ i \ne k),$

where $\phi$ is a fixed symmetric function of n - 1 arguments independent of j and k. Since $\phi$ is
a conditional expectation function, it may be estimated using any suitable nonparametric
regression method applied to the data from all m populations. To make maximum use of
the information available for the estimation of $\phi$, we may organize the mn observations as
follows:


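“Dependent”     “Independent”

$h(X_{11})$     $X_{21}, X_{31}, \ldots, X_{n1}$
$h(X_{21})$     $X_{11}, X_{31}, \ldots, X_{n1}$
...
$h(X_{n1})$     $X_{11}, X_{21}, \ldots, X_{n-1,1}$
$h(X_{12})$     $X_{22}, X_{32}, \ldots, X_{n2}$
...
$h(X_{nm})$     $X_{1m}, X_{2m}, \ldots, X_{n-1,m}$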


Because of the symmetry of the function $\phi$ we should increase this list by including all
permutations of the “independent” values, but this may be avoided by first ordering the
observations from each population so that $X_{1j} \le X_{2j} \le \cdots \le X_{nj}$ for each j. This, of
course, leads to a different (nonsymmetric) regression function, say $\psi$, which is defined
only for ordered arguments but contains the same information as $\phi$. Henceforth, we shall
assume that the $X_{ij}$'s are ordered in this fashion. If $\hat\psi_m$ represents a suitable nonparametric
regression estimate of $\psi$ based on the available data, then the proposed empirical Bayes
estimator of $\theta_j$ is

(6)  $\hat\theta_j = \frac{1}{n} \sum_{k=1}^{n} \hat\psi_m(X_{ij},\ 1 \le i \le n,\ i \ne k),$

for j = 1, 2, ..., m. The averaging over n values of $\hat\psi_m$ indicated in (6) results in a slight
improvement in the performance of the estimator (see (2.47), p. 656 of Johns, 1957).
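
The layout above and the averaging in (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the data are simulated with arbitrary settings, and an ordinary linear least squares fit stands in for the nonparametric regression estimate (it is not SMART and not the original partition estimator).

```python
import numpy as np


def leave_one_out_design(x, h=lambda v: v):
    """Build the regression layout from an (n, m) array of observations.

    Row (j, k) pairs the "dependent" value h(X_kj) with the "independent"
    values: the other n - 1 observations from population j, in ascending order.
    """
    x = np.sort(np.asarray(x, dtype=float), axis=0)  # order within each population
    n, m = x.shape
    y = np.empty(n * m)
    z = np.empty((n * m, n - 1))
    for j in range(m):
        for k in range(n):
            y[j * n + k] = h(x[k, j])
            z[j * n + k] = np.delete(x[:, j], k)  # drop the kth ordered value
    return y, z


def fit_least_squares(y, z):
    """Hypothetical stand-in for SMART: linear least squares with an intercept."""
    design = np.column_stack([np.ones(len(y)), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return lambda z_new: np.column_stack([np.ones(len(z_new)), z_new]) @ coef


def empirical_bayes_estimates(x, h=lambda v: v, fit=fit_least_squares):
    """Average the n leave-one-out predictions for each population, as in (6)."""
    n, m = np.asarray(x).shape
    y, z = leave_one_out_design(x, h)
    psi_hat = fit(y, z)
    return psi_hat(z).reshape(m, n).mean(axis=1)


# Illustrative simulated data (arbitrary settings): normal prior, normal sampling.
rng = np.random.default_rng(0)
theta = rng.normal(25.0, 2.0, size=2000)
x = rng.normal(theta, 4.0, size=(5, 2000))
eb = empirical_bayes_estimates(x)
print(np.mean((eb - theta) ** 2), np.mean((x.mean(axis=0) - theta) ** 2))
```

Any regression routine that accepts the pair (y, z) and returns a prediction function could be passed as `fit`; the linear fit is used here only because it is short and self-contained.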

The original formulation of the fully nonparametric empirical Bayes estimation problem
considered the component problems in sequence and concentrated on the risk for the
mth problem using the estimated conditional expectation based on the data from the
previous m - 1 problems. Strictly speaking, the original asymptotic optimality result applies
to the present case only if we modify the procedure indicated above so that for each j the
estimate of $\psi$ involves only data from the other m - 1 component problems. Then, for the
modified procedure and the original partition estimate of $\psi$, if we let $\hat\theta$ be the vector of
$\hat\theta_j$'s given by (6) the following result holds:

THEOREM (Johns, 1957) If $E\{h^2(X)\} < \infty$, then

(7)  $R_n^* \le \lim_{m \to \infty} R_n(\hat\theta) \le R_{n-1}^*,$

where $R_n^*$ = the Bayes optimal risk for a component problem with sample size n, and
$R_n(\hat\theta)$ is the average risk using the empirical Bayes estimator $\hat\theta$ where the sample size is n
for each component problem.

The modified procedure is too cumbersome for application to actual data since it
entails repeated estimation of the function $\psi$. It seems plausible that (7) will hold for the
unmodified procedure based on any well behaved estimator of the function $\psi$ for which the
pointwise convergence in probability to $\psi$ as m becomes large is asymptotically unaffected
by the values of the $X_{ij}$'s for any fixed j.

In applications, if n is large and m is not very large, the estimate $\hat\psi_m$ may be unstable
and it may be desirable to substitute a summary statistic of lower dimension for the n - 1
arguments of $\psi$. If this summary statistic is well chosen the resulting loss of asymptotic
efficiency may be slight. One possibility would be to replace the conditioning $X_{ij}$'s by a
two-dimensional statistic consisting of robust estimators of location and scale. In some of
the examples considered in the present paper, a less drastic reduction in dimension has
been obtained by replacing the n - 1 ordered $X_{ij}$'s by d averages of s successive ordered
values where ds = n - 1. It may be shown (see, e.g., Johns, 1974) that such averages
of blocks of order statistics retain most of the sample information about the underlying
distribution.
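
As a small illustration of this reduction, the following Python sketch replaces n - 1 ordered values by d averages of s successive ordered values. The function name `block_averages` is hypothetical and the input values are made up for the example.

```python
import numpy as np


def block_averages(values, s):
    """Return the d averages of s successive ordered values, where d * s = len(values)."""
    ordered = np.sort(np.asarray(values, dtype=float))
    if ordered.size % s != 0:
        raise ValueError("the number of values must be a multiple of s")
    return ordered.reshape(-1, s).mean(axis=1)


# Ten conditioning values (n - 1 = 10) reduced to d = 5 averages of s = 2.
print(block_averages([9, 1, 4, 3, 10, 2, 8, 7, 6, 5], 2))
```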


As was mentioned in the introduction, the method used to estimate the required
conditional expectation in the present study is the SMART algorithm of Friedman (1984).
Given a number of iid observations of a dependent variable Y and the corresponding values
of “independent” variables $X_1, X_2, \ldots, X_p$, the algorithm estimates $E\{Y \mid X_1, X_2, \ldots, X_p\}$
nonparametrically by an expression of the form

(8)  $\sum_{r=1}^{M} \beta_r f_r(a_r X'),$

where $X = (X_1, X_2, \ldots, X_p)$ and $a_r = (a_{r1}, a_{r2}, \ldots, a_{rp})$. The $a_r$'s and the functions $f_r(\cdot)$ are
suitably normalized to avoid identifiability difficulties. The $a_r$'s, $\beta_r$'s, $f_r(\cdot)$'s and number
of terms in (8) are chosen to satisfy a least squares criterion, where the functions are
generated by a variable span smoother.


3.  Examples. 

The proposed nonparametric empirical Bayes estimation procedure incorporating the
SMART algorithm as implemented on a VAX 11/750 computer was applied to six sets of
simulated data and one set of real data. For each example, the quantities being estimated
(i.e., the $\theta_j$'s) are the means of the component populations. The simulated data sets
consist in each case of either 50 or 100 component populations. These numbers are perhaps
larger than would be expected in some applications to real world data but were chosen
to yield reasonably stable and interpretable results. The sample sizes associated with the
component problems are 5 or 6 for the 100 component cases and 11 for the 50 component
cases.

The conditional distributions are either normal with mean = $\theta$ and standard deviation
= $\sigma$, or logistic with mean = $\theta$ and scale = $\sigma$. The prior distributions for $\theta$ are either
normal with mean = $\mu$ and standard deviation = $\tau$, or the longtailed distribution having
density

(9)  $g(\theta) = \dfrac{\sqrt{2}}{\pi (1 + \theta^4)}.$

This distribution has mean = 0 and standard deviation = 1. For two examples the scale
parameter $\sigma$ for the conditional distribution was chosen randomly from three possible
values. The summary statistic on which the predicted values of $\theta$ are based is either all
n - 1 available observations or, for n = 11, the set of five averages of two adjacent order
statistics. The setup for each of the six cases simulated is given in Table 1.
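
The stated moments of the longtailed prior can be checked numerically. The sketch below is an illustrative check only: the function names are hypothetical, and the substitution $\theta = \tan u$ with a midpoint rule is simply one convenient way to handle the heavy tails on a finite interval.

```python
import numpy as np


def longtail_density(theta):
    """Longtailed prior density: sqrt(2) / (pi * (1 + theta^4))."""
    return np.sqrt(2.0) / (np.pi * (1.0 + theta ** 4))


def longtail_moments(num_points=20000):
    """Total mass, mean and variance via theta = tan(u) and a midpoint rule."""
    width = np.pi / num_points
    u = -np.pi / 2 + width * (np.arange(num_points) + 0.5)  # midpoints in (-pi/2, pi/2)
    theta = np.tan(u)
    jacobian = 1.0 / np.cos(u) ** 2  # d(theta)/du
    weights = longtail_density(theta) * jacobian * width
    total = weights.sum()
    mean = (theta * weights).sum()
    variance = (theta ** 2 * weights).sum() - mean ** 2
    return total, mean, variance


total, mean, variance = longtail_moments()
print(total, mean, variance)
```

The three printed values should be close to 1, 0 and 1 respectively, in agreement with the normalizing constant and the moments quoted above.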

TABLE 1
Cases Simulated

Case    Conditional   Prior            No. of   Sample   Prior     Prior     Cond.        Summary
Label   Distr.        Distr.           Pops.    Size     Mean      S.D.      Scale*       Statist.
                      (for $\theta$)   (m)      (n)      ($\mu$)   ($\tau$)  ($\sigma$)

(a)     Normal        Normal           100      5        25        2         2,4,6        all 4 obs.
(b)     Normal        Normal           100      5        25        2         4            all 4 obs.
(c)     Normal        Normal           50       11       25        2         6            5 avgs.
(d)     Normal        Longtail         100      5        0         1         2            all 4 obs.
(e)     Logistic      Normal           100      6        0         2         3            all 5 obs.
(f)     Logistic      Longtail         50       11       0         1         4,5,6        5 avgs.

* Each value has equal prior probability and is independent of $\theta$.

TABLE 2
Summary of the Simulation Results

Case    Conditional   Prior            Bayes       Asymptotic    Observed      Observed
Label   Distr.        Distr.           Opt. Risk   BLUE M.S.E.   BLUE M.S.E.   EB M.S.E.
                      (for $\theta$)   (Approx.)

(a)     Normal        Normal*          1.67        3.73          4.29          1.98
(b)     Normal        Normal           1.78        3.20          3.32          1.57
(c)     Normal        Normal           1.80        3.27          3.38          2.58
(d)     Normal        Longtail         0.44        0.80          0.69          0.45
(e)     Logistic      Normal           2.12        4.50          4.50          2.69
(f)     Logistic      Longtail*        0.86        7.00          6.98          3.40

* The values of sigma are selected randomly from among three values.

The numerical results obtained from the six simulations are summarized in Table 2.
The last column shows the actual mean squared error (M.S.E.) produced by the fully
nonparametric empirical Bayes procedure. For comparison purposes both the average
observed variances and the true (asymptotic) variances for the best linear unbiased estimators
(BLUE’s) are shown. For the normal cases, of course, the BLUE is simply the sample mean.
Approximate values for the Bayes optimal risk are also given. These are based on linear
Bayes estimators and asymptotic variances so they are only exact for cases (b) and (c)
where both the conditional and the prior distributions are normal. It is encouraging to
note that the empirical Bayes M.S.E. is substantially smaller than the BLUE variance for
each of the examples. Furthermore, the empirical Bayes M.S.E. is in the vicinity of the
Bayes optimal risk for all cases but one (example (f)).
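
The approximate Bayes risks and asymptotic BLUE variances in Table 2 can be reproduced from the settings in Table 1 under assumptions that are not spelled out in the text. The sketch below assumes an asymptotic variance of $\sigma^2/n$ for the normal cases and $3\sigma^2/n$ for the logistic cases (the inverse Fisher information for logistic location), a linear Bayes risk of $\tau^2 v / (\tau^2 + v)$ for asymptotic variance v, and simple averaging over the three equally likely scale values in cases (a) and (f). The function and variable names are hypothetical.

```python
# Settings taken from Table 1: conditional family, sample size n,
# prior standard deviation tau, and the possible conditional scale values.
CASES = {
    "a": {"cond": "normal", "n": 5, "tau": 2.0, "scales": [2.0, 4.0, 6.0]},
    "b": {"cond": "normal", "n": 5, "tau": 2.0, "scales": [4.0]},
    "c": {"cond": "normal", "n": 11, "tau": 2.0, "scales": [6.0]},
    "d": {"cond": "normal", "n": 5, "tau": 1.0, "scales": [2.0]},
    "e": {"cond": "logistic", "n": 6, "tau": 2.0, "scales": [3.0]},
    "f": {"cond": "logistic", "n": 11, "tau": 1.0, "scales": [4.0, 5.0, 6.0]},
}


def asymptotic_blue_variance(cond, scale, n):
    """Assumed asymptotic variance of the BLUE of the mean."""
    if cond == "normal":
        return scale ** 2 / n
    if cond == "logistic":
        return 3.0 * scale ** 2 / n
    raise ValueError("unknown conditional family: " + cond)


def linear_bayes_risk(tau, v):
    """Risk of the best linear shrinkage of an unbiased estimator with variance v."""
    return tau ** 2 * v / (tau ** 2 + v)


def table2_columns(case):
    """Return (approximate Bayes risk, asymptotic BLUE M.S.E.), averaged over scales."""
    variances = [asymptotic_blue_variance(case["cond"], s, case["n"]) for s in case["scales"]]
    risks = [linear_bayes_risk(case["tau"], v) for v in variances]
    return sum(risks) / len(risks), sum(variances) / len(variances)


for label, case in CASES.items():
    risk, variance = table2_columns(case)
    print(label, round(risk, 2), round(variance, 2))
```

Under these assumptions the printed pairs agree with the first two numerical columns of Table 2 to two decimal places.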

The actual regression functions produced by the SMART algorithm are plotted in
Figures 1 and 2. In all cases the algorithm concluded that only a single function $f_1$ was
required in expression (8) for an adequate description of the data. When interpreting the
plots it should be borne in mind that a different direction vector a is associated with each
function. The vector X represents the appropriate set of “independent” variables.
FIGURE 1

SMART Regression Functions


FIGURE 2

SMART Regression Functions


We observe that the plots are quite linear for all cases with normal conditional
distributions but distinctly nonlinear for the logistic cases. It was thought that example (a)
might yield a nonlinear regression because of the random prior on $\sigma$. A numerical
calculation of the actual conditional expectation of the mean given the sample mean and the
sample variance verified that the regression surface was in fact fairly linear. A plot of this
surface evaluated at a set of grid points is shown in Figure 3.

An actual real world data set was also analyzed using the fully nonparametric
empirical Bayes scheme. The data was obtained from Efron-Morris (1975) and consists of
the batting averages for 18 major league baseball players for their first 45 times at bat
and their averages for the remainder of the season which represent the ‘true’ values one
wishes to predict. Efron-Morris first transform the data to approximate normality using
the arcsine transformation. They then compute the Stein estimator (Stein, 1955) and their
own proposed estimator based on a linear empirical Bayes formula modified to limit the
maximum component risk. The results are then converted back to proportions. For the
present study the data was considered in its original form as a set of Bernoulli observations
(hits or non-hits) and the fully nonparametric empirical Bayes method was applied. The
results are shown in Table 3. The third column gives the maximum likelihood estimate
(MLE) which is just the observed proportion of hits in the first 45 at bats. The
nonparametric empirical Bayes estimate is given in the fourth column and Stein’s estimate in the
fifth. The Efron-Morris limited risk estimate with index .8 is given in the last column.
The corresponding mean squared errors of prediction are shown in the last row.


TABLE 3
Batting Averages and Their Estimates

i        ‘TRUE’    MLE       NP-EB     STEIN     EMEST(.8)

1        .346      .400      .306      .290      .351
2        .298      .378      .293      .286      .329
3        .276      .356      .281      .281      .308
4        .222      .333      .269      .277      .287
5        .273      .311      .256      .273      .273
6        .270      .311      .256      .273      .273
7        .263      .289      .247      .268      .268
8        .210      .267      .247      .264      .264
9        .269      .244      .254      .259      .259
10       .230      .244      .254      .259      .259
11       .264      .222      .258      .254      .254
12       .256      .222      .258      .254      .254
13       .303      .222      .258      .254      .254
14       .264      .222      .258      .254      .254
15       .226      .222      .258      .254      .254
16       .285      .200      .266      .249      .242
17       .316      .178      .274      .244      .218
18       .200      .156      .283      .239      .194

M.S.E.             .00419    .00105    .00120    .00139

We observe that the procedure proposed in this study has the smallest mean squared
error of prediction and does better than the Efron-Morris estimator in three out of the
five cases (i = 1, 2, 3, 17, 18) where their procedure limits the risk. The highly nonlinear
regression function which SMART produces for this case is plotted in Figure 4. The
abscissa of this figure is a linear function of the number of hits in 44 at bats.
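
The last row of Table 3 and the three-out-of-five comparison can be recomputed directly from the tabulated values. The following Python sketch does this; the variable and function names are hypothetical, and the inputs are the rounded three-decimal entries of Table 3, so agreement is expected only to the precision shown.

```python
TRUE = [.346, .298, .276, .222, .273, .270, .263, .210, .269,
        .230, .264, .256, .303, .264, .226, .285, .316, .200]
ESTIMATES = {
    "MLE": [.400, .378, .356, .333, .311, .311, .289, .267, .244,
            .244, .222, .222, .222, .222, .222, .200, .178, .156],
    "NP-EB": [.306, .293, .281, .269, .256, .256, .247, .247, .254,
              .254, .258, .258, .258, .258, .258, .266, .274, .283],
    "STEIN": [.290, .286, .281, .277, .273, .273, .268, .264, .259,
              .259, .254, .254, .254, .254, .254, .249, .244, .239],
    "EMEST(.8)": [.351, .329, .308, .287, .273, .273, .268, .264, .259,
                  .259, .254, .254, .254, .254, .254, .242, .218, .194],
}


def mean_squared_error(estimates, truth):
    """Average squared prediction error over the players."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth)


mse = {name: mean_squared_error(values, TRUE) for name, values in ESTIMATES.items()}

# Players for which the Efron-Morris procedure limits the risk (1-based indices).
limited = [1, 2, 3, 17, 18]
np_eb_wins = sum(
    abs(ESTIMATES["NP-EB"][i - 1] - TRUE[i - 1]) < abs(ESTIMATES["EMEST(.8)"][i - 1] - TRUE[i - 1])
    for i in limited
)

for name, value in mse.items():
    print(name, round(value, 5))
print("NP-EB closer than EMEST(.8) in", np_eb_wins, "of", len(limited), "limited cases")
```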


FIGURE  4 

SMART  Regression  Function 


4.  Concluding  Remarks. 

The  estimation  procedures  discussed  here  may  be  modified  and  generalized  in  various 
ways.  We  may  expect  that  ever  more  sophisticated  nonparametric  regression  methods  will 
be  developed.  Such  procedures  may  then  be  substituted  for  the  projection  pursuit  part 
of  the  scheme.  The  empirical  Bayes  problem  described  here  assumes  equal  sample  sizes 
for  all  component  populations.  The  case  of  unequal  sample  sizes  may  be  dealt  with  by 
various  ad  hoc  methods  some  of  which  are  discussed  in  the  original  paper  (Johns,  1957). 
The  question  of  the  best  way  to  proceed  in  such  cases  is  still  open. 

In the preceding sections the quantities to be estimated were required to be represented
as functionals of the form (3). However, within this framework we may estimate the
conditional cdf F(t) for any fixed t by letting h(x) = the indicator function of the interval
$(-\infty, t]$. Since F(t) can be recaptured, it should be possible to modify the procedure to
permit the estimation of other functionals T(F) such as, e.g., the median of F.

As is true of most empirical Bayes problems, the present one may be reinterpreted
as a compound decision problem by dropping the assumption of the existence of a prior
probability distribution, and replacing it with a suitable empirical distribution of unknown
quantities. In the present case these quantities are the component cdf’s $F_1, F_2, \ldots, F_m$.
Presumably results paralleling the empirical Bayes results would be forthcoming here as
in previously considered problems. (See Robbins (1951) for the original formulation of the
key ideas and Gilliland (1968) and Johns (1967) for some further developments.)

The  SMART  algorithm  used  in  the  applications  considered  in  this  study  requires  the 
specification  of  certain  operating  parameters.  The  most  significant  of  these  was  found  to 
be  the  span  parameter  controlling  the  variable  span  smoother.  This  was  assigned  a  value 
of  either  0.6  or  0.7  for  all  of  the  examples  considered. 

Finally,  the  author  wishes  to  express  his  thanks  to  David  J.  Pasta  who  rendered 
invaluable  assistance  in  the  application  of  the  SMART  algorithm  to  the  data  of  this  study. 